(57) Abstract

For encoding sound received as a stream of multibit input samples, an
instantaneous audibility threshold characteristic is determined from a finite
length sequence of input samples. Next, a shaped, dither-determined noise
signal is subtracted from the input samples to produce processing samples.
Subtracting a dither signal dynamically ignores processing sample bits below
the threshold. Next, quantizing by a variable number of bits below the
threshold is done, while retaining all sample bits above the threshold. The
ignored bits are replaced by the dither signal as buried channel bits at an
adjustable number of bits per sample. Therefore, upgraded outputted samples
have non-ignored bits and buried channel bits. The noise is produced through
shape-filtering a difference between the upgraded samples and the processing
samples, which shape-filtering may amend a received white-noise-like signal
towards an actual threshold-versus-frequency characteristic.

Method and apparatus for encoding multibit coded digital sound through subtracting adaptive
dither, inserting buried channel bits and filtering, and encoding and decoding apparatus for
use with this method.



FIELD OF THE INVENTION 

The invention relates to a method for encoding sound received as a stream
of multibit samples while inserting buried channel bits. Such a method has been described in
a publication by M.A. Gerzon and P.G. Craven, 'A High-Rate Buried Channel for Audio
CD', preprint 3551 of the Audio Eng. Soc. Conv. Berlin, March 1993. See also International
Patent Application WO 94/03988, priority August 5, 1992, published February 17, 1994 to
the same authors. The first reference is based on a relatively simple way of adding a buried
channel through subtractively dithered noise-shaped quantization. Although the added feature
allows for enhancing the transmitted high quality sound by a buried channel, the present
inventors have discovered that the reference does not fully exploit the potentials of the
'subtraction' principle, which omission leads to either a lower than optimum transfer capacity
of the buried channel, or alternatively, to a lower than intended residual perceptive quality of
the originally high-quality sound, for example CD sound, but not being limited thereto.

SUMMARY TO THE INVENTION 
In consequence, amongst other things, it is an object of the present
invention to improve the characteristics of the buried channel inclusion for optimizing both
the residual perceptive quality and the transfer capacity of the buried sound channel. Now,
according to one of its aspects, the invention is characterized by the steps of:

- constituting a finite length sequence of said input samples and in said sequence determining
an instantaneous non-uniform audibility threshold-versus-frequency characteristic;

- subtracting a shaped noise signal from said input samples to produce processing samples;

- dynamically ignoring processing sample bits below a bit level associated to said
characteristic, through subtracting a dither signal (v) and subsequent quantizing by a variable
number of b bits below said bit level, but retaining at least all processing sample bits above
said bit level;

- replacing such ignored processing sample bits by said dither signal as buried channel bits
featuring an adjustable number of b bits per said processing sample;

- outputting upgraded samples (y) comprising non-ignored processing sample bits and buried
channel bits;

- while producing said noise signal through shape-filtering by a noise shaping filter on a
dither-determined difference between said upgraded samples and said processing samples,
which shape-filtering amends a received difference signal towards said instantaneous
threshold-versus-frequency characteristic.

In particular, the shape-filtering of the difference towards the
instantaneous threshold-versus-frequency characteristic allows for creating "spectrum space"
at those frequencies where the human hearing system is relatively insensitive. The result is
that for a rather sizable length of samples a uniform number of buried channel bits per
sample can be injected. For a subsequent sequence, the number of buried channel bits has to
be determined again. An extended feature is that next to shaping the characteristic of the
main channel, also the characteristic of the buried channel can be shaped in similar manner
to optimize transfer capacity. In particular, the dither signal would have to be shaped. This
adds a certain amount of complexity to the system.

Advantageously, the noise-shaping filter receives a difference signal that
approaches a white-noise-like characteristic. In many cases, the requirements to the
difference signal are not high with respect to its possible influence on the quality of the main
channel, and therefore, the dither may have an arbitrary content. In other situations, the
difference may not be correlated with the information of the main channel, or may not even
be self-correlated. In the latter situations, the dither is preprocessed to get the appropriate
absence of correlation. Such randomizing measures by themselves are well-known.

Advantageously, said producing is done by a quantizer filter with a
variable characteristic. In this way, improved adaptation to said instantaneous audibility and
improved subjective audio quality are reached.

Advantageously, said shape-filtering is done with a filter having an overall
filter curve compounded from a series of elementary filter curves each positioned at a
respective unique grid frequency $\theta_k^c$ and having a width $\Delta_k$, approximating a local power
spectral density of the overall spectrum. This is an extremely straightforward method of
modelling the power spectral density, inter alia allowing said grid frequencies to have
non-uniform spacing. In general, this improves the accuracy versus the number of grid
frequencies used, and thus speeds up the calculation.

The invention also relates to an encoding apparatus for realizing the
encoding and to a decoding apparatus for decoding a signal as acquired through effecting of
the method. Various further advantages are recited in dependent claims.

BRIEF DESCRIPTION OF THE DRAWING 

These and other aspects and advantages of the invention will be described
in more detail hereinafter with reference to preferred embodiments, and in particular with
reference to the appended drawings that show:

Figure 1 an overall block diagram of a device according to the invention;
Figure 2 a subtractively dithered quantizer for use as a basis for a buried
channel encoder according to the reference enhanced according to the invention;
Figure 3 a frequency dependent masked threshold through an exemplary
sound spectrum;
Figure 4 a first exemplary buried channel formation;
Figure 5 a second exemplary buried channel formation;
Figure 6 a simplified CELP encoder without pitch prediction;
Figure 7 a noise shaping quantizer;
Figure 8 an elementary filter curve.

BRIEF DISCUSSION OF THE PRINCIPLES

The buried channel technology exploits the fact that an audio or sound
signal is often represented by an accuracy expressed by the length of its sample bit string that
is actually too high in terms of perceived audio quality. Therefore, the amount of information
may be reduced, thereby freeing transfer capacity for an additional information service.
The additional information is inserted in the least significant portion of the main signal. To a
conventional receiver this modification of the main signal is irrelevant inasmuch as a human
listener will not notice the difference. An enhanced receiver system however, will retrieve
the additional information and produce this on a separate output. According to the invention,
the difference spectrum is shape-filtered for amending a received signal towards an actual
threshold-versus-frequency characteristic. This allows for creating "spectrum space" at those
frequencies where the human hearing system is relatively insensitive. If the difference is
white-noise-like, the listener will be even less sensitive to the added channel.



DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS 

Figure 1 shows an overall block diagram of a device according to the
invention. Block 20 is a source of digital sound that is encoded as a stream of samples that
may consist, for example, of 16 bits each, recurring at some 44 kHz. The sound has a
predefined bandwidth and can have an arbitrary content such as music, speech, or other.
Block 22 constitutes a finite length of these samples, such that those take up a certain interval
of time, say, 1024 samples or about 0.02 seconds, and therein determines an audibility threshold
versus frequency characteristic. The audibility may be determined on the basis of a limited
number of key characteristics of the incoming audio. It depends on the instantaneous
frequency, on masking effects on one frequency band through influence from another band, it
depends on the general or local loudness of the sound, and it may vary between listeners, the
last variation being generally ignored, however. The threshold may be determined in ways
that by themselves are known, and the result will be shown hereinafter. Furthermore, after
determination of the threshold, the samples are quantized by ignoring a number of b
low-significance bits thereof. Block 24 is a source for buried channel bits. The nature of the
buried channel may be arbitrary, such as additional comment to the main channel such as
displayable subtitles or text, an additional sound channel in a multi-channel sound
reproduction, of similar or respective different quality levels, multi-lingual speech service,
Karaoke or even video. Also, non-related services are conceivable. However, a particularly
advantageous usage is to define the buried channel as an MPEG audio channel. By itself this
standard has proven useful to provide high-quality audio transfer at a moderate bit-rate.
Furthermore, the buried channel itself may consist of two or more sub-channels that are
functionally unrelated, although together making up for the buried part of the upgraded
samples. In block 26 the ignored sample bits from source 20, or a fraction thereof, starting
at the lower significance levels, are replaced by bits from source 24. Moreover, in mutually
spaced locations of the stream of upgraded samples, indications are inserted in the buried
channel as to what is or will subsequently be the number of ignored bits per sample, and if
applicable, when the next indication will occur. For efficient operation the spacing between
these indications should be set at an optimum value. If the spacing is too small, the overhead
increases. If the spacing is too large, the number of ignored bits is too low as seen from the
individual samples. The channel 28 that may have transmitting, storing or further quality,
forwards the upgraded samples to receiver 30. Receiver 30, on the basis of these received
indications, can separate the standard part of the samples from the buried channel bits. The
standard part of the samples is forwarded to decoder 32 that represents the audio in standard
manner, wherein the substituted bits are maintained, as representing sub-audible noise. The
buried channel bits are forwarded to a subsidiary decoder that has been programmed for
correct processing thereof. Another set-up is that the buried channel decoder receives the
whole samples as produced at a digital output of a channel receiver and extracts the buried
channel information therefrom, while ignoring the standard channel. If, on the other hand, a
normal non-upgraded receiver has been provided for the channel, this will process the
upgraded samples as if they were normal samples. This eventually ends in an analog audio
amplifier that feeds some kind of speaker. The buried channel bits, being generally
uncorrelated to the main channel bits, now directly represent some kind of noise that remains
below the intended auditory threshold.
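
The trade-off in spacing the indications can be illustrated with a small sketch. It assumes, purely for illustration, that each indication costs a fixed number of header bits and that one value of b per interval must suit every sample in that interval; the function name `net_payload`, the parameter `header_bits` and the example figures are hypothetical and do not come from the specification.

```python
def net_payload(allowed_bits, spacing, header_bits=8):
    """Net buried-channel bits when one indication is sent every `spacing` samples.

    allowed_bits[i] is the number of bits that may be ignored in sample i
    (hypothetical per-sample values). header_bits is an assumed indication cost.
    """
    total = 0
    for start in range(0, len(allowed_bits), spacing):
        block = allowed_bits[start:start + spacing]
        b = min(block)  # a single b per interval must be safe for all its samples
        total += max(0, b * len(block) - header_bits)
    return total


# A quiet stretch (2 bits allowed) followed by a loud stretch (6 bits allowed).
allowed = [2] * 8 + [6] * 8
for spacing in (1, 4, 8, 16):
    print(spacing, net_payload(allowed, spacing))
```

With these assumed numbers a very small spacing loses everything to overhead, a very large spacing is held down by the quiet samples, and an intermediate spacing carries the most payload.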

Figure 2 shows a subtractively dithered quantizer for use as a basis for a
buried channel encoder according to the reference, as enhanced with certain features
according to the invention. On input 52 a b-bit dither signal v is input in a way to be
discussed hereinafter. Elements 54, 58, 60, 62 are digital adders-subtractors of an
appropriate bit width. Element 56 is a quantizer that reduces the accuracy of the exemplary
16-bit received signal x to a lower number of 16-b bits by means of truncation. Such a
feature by itself has been described in S.P. Lipshitz et al, "Quantization and Dither: A
Theoretical Survey", J. Audio Eng. Soc. Vol. 40, no. 5, pp. 355-375, May 1992. The usage of
a b-bit dither signal v, if the lack of correlation with the main channel is sufficient,
ensures that the quantization error e remains spectrally white and statistically independent
from the input signal x, which is preferable for perceptivity reasons. The dither signal may
be a randomized version of the buried channel signal, without adding or sacrificing
information. Such randomization can be reversed without the need for resynchronization. It is
recognized that the random character is specified relative to the main channel, as well as
within the buried channel itself. If within the context of the buried channel itself, the signal
is well-structured, it may be randomized by conventional means. The same dither signal v is
added in element 54 to form the compatible output signal y at output 66, for storage,
transmission or further processing. Block 64 is a noise shaping filter and receives the
difference between the compatible output signal y and the input signal before introduction of
the dither signal v, as produced by subtractor 62. The output signal of noise shaping filter
64 is fed back to subtractor 60 that in its turn receives the original audio signal x. It has been
found that the noise loudness can be decreased by about 16 dB with only a 9-th order FIR
(finite impulse response) filter. This approach will make the noise level of a 2-3 bits per
sample buried channel signal of comparable loudness as the inherent noise floor in the CD
signal. The filter characteristic H(Z) should be such that y, defined as



$$y = x + |1 - H(Z)|^2 \cdot \Delta^2 / 12$$

should be changed with respect to x by an amount that is subjectively inconsequential to the
listener, $\Delta = 2^b$ being the elementary step size. Now, the transfer rate of the buried channel
depends on the coarseness of the requantization operation. In this respect, p. 2, l. 40 ff. and
p. 13, fifth para., of the first Gerzon et al. reference indicate that the requantization may be
made more coarse when the resulting error is masked by a high level main audio signal. The
present inventors, on the other hand, have discovered an even more effective way to increase
the transfer rate of the buried channel, namely by using the frequency-dependent sensitivity
of the human hearing system. A further refinement to the arrangement of Figure 2 is the
buffer 50 that may temporarily store the data supplied by the buried channel 68. In view of
the non-uniform rate of the buried channel at output 66, the buffer may have some kind of
feed-back organization that keeps its filling degree more or less constant. If the buffer gets
too empty, the full capacity of the buried channel may be surrendered in part. If the buffer
gets too full, there are various strategies: one is lowering the feed rate from source 68. A
more drastic one, if more than one buried sub-channel is present, is the surrendering of the
least important sub-channel therefrom. If the subchannel represents moving video, it could
temporarily be reduced to a sequence of stills. Various modifications of the arrangement of
Figure 2 are self-evident: for example, certain of the addition devices may be changed to
subtraction devices. This would slightly complicate the hardware, because of the necessary
propagating of borrows. Especially with certain sample notation systems, the impact is
minimal, however.
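
The signal flow of Figure 2 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the circuit of the specification: truncation is modelled as rounding down to a multiple of 2^b, the noise shaping filter is a plain FIR with hypothetical taps `h` acting on past errors, the filtered error is rounded to an integer, and overflow clipping and the buffer 50 are left out. The names `encode` and `decode` are introduced only for this example.

```python
def encode(samples, buried, b, h=()):
    """Embed b buried bits per sample in a subtractively dithered quantizer.

    samples: integer input samples x; buried: integers in [0, 2**b) used as dither v;
    h: hypothetical FIR taps of the noise shaping filter (empty means no shaping).
    """
    step = 1 << b
    history = []                       # past errors e[n-1], e[n-2], ...
    upgraded = []
    for x, v in zip(samples, buried):
        shaped = sum(c * e for c, e in zip(h, history))
        u = x - int(round(shaped))     # processing sample (subtractor 60)
        q = ((u - v) // step) * step   # subtract dither, then truncate b bits
        y = q + v                      # add the same dither back (element 54)
        e = y - u                      # difference fed to the shaping filter
        history = ([e] + history)[:len(h)]
        upgraded.append(y)
    return upgraded


def decode(upgraded, b):
    """Recover the buried bits: they are the b least significant bits of y."""
    mask = (1 << b) - 1
    return [y & mask for y in upgraded]


x = [1000, -1234, 7, 32767, -32768]
v = [3, 0, 1, 2, 3]
y = encode(x, v, b=2)
print(y, decode(y, 2))
```

Because the quantized value is a multiple of 2^b and the dither lies in [0, 2^b), the low b bits of each upgraded sample equal the buried bits regardless of the filter taps, which is why a conventional receiver simply hears them as low-level noise.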

In this respect, Figure 3 shows a frequency dependent masking threshold
through a first exemplary sound spectrum. Figure 4 shows a first exemplary buried channel
formation, based on zero sound input.

In Figure 4, for a standardized or presumed human hearing system, curve
4 gives the audibility level, as based on single frequencies. For calculatory reasons, curve 4
has been simulated (note the straight parts thereof), but it closely follows the natural
phenomena. At approximately 4000 Hz this threshold is lowest, whilst being much higher at
either higher or lower frequencies. Now, trace 1 indicates the flat CD noise floor, that is
given as $10 \log_{10}(1/(12 \times 22050))$ dB. Now, although curve 4 gives the audibility threshold
for single frequencies, for noise the audibility effects are much higher, and its frequency
characteristic should lie much lower than curve 4. Now, curve 2 is the spectrum of the flat
noise of curve 1, shaped to get approximately the same frequency dependence as the
threshold of curve 4. It has been found experimentally that neither the few dB deviations
from the exact approximation, nor the markedly flatter shape above some 15 kHz have a
negative influence on the overall performance. Curve 3 is identical to curve 2, but relatively
shifted upward over a distance of b*6 dB, wherein in Figure 4, b=2. This implies a buried
channel of two (2) bits wide per sample. It has been found that the distance between curves 3
and 4 keeps the added information unheard. The minimum distance between curves 1 and 4
is $10 \log_{10}(660)$ dB, which corresponds to the critical bandwidth around 4 kHz. The design is
preferably made with the help of an auxiliary curve that indicates the integrated power of the
spectrum of curve 3, and which may not touch curve 4; for simplicity, this auxiliary curve
has not been shown.
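
The decibel figures quoted for Figure 4 can be checked with a few lines of arithmetic. The sketch below assumes that the noise floor expression reads 10 log10(1/(12 x 22050)), i.e. a quantization noise power of 1/12 (in units of the squared step) spread evenly over 22050 Hz; the function names are introduced only for this example.

```python
import math

HALF_SAMPLING_RATE_HZ = 22050  # half the CD sampling rate


def cd_noise_floor_db():
    """Flat noise floor of trace 1, assuming 10*log10(1 / (12 * 22050))."""
    return 10 * math.log10(1 / (12 * HALF_SAMPLING_RATE_HZ))


def upward_shift_db(b):
    """Shift of curve 3 above curve 2 for a b-bit buried channel (6 dB per bit)."""
    return 6 * b


def critical_band_margin_db(bandwidth_hz=660):
    """Minimum distance between curves 1 and 4, 10*log10(660)."""
    return 10 * math.log10(bandwidth_hz)


print(round(cd_noise_floor_db(), 2))        # about -54.23
print(upward_shift_db(2))                   # 12
print(round(critical_band_margin_db(), 2))  # about 28.2
```

The 6 dB per bit figure is the usual rounding of 20 log10(2), which is roughly 6.02 dB.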

Figure 3 shows the influence of an actual sound spectrum on the shape of
the audibility threshold. Now curve A is the actual sound spectrum. Curve B again shows
the audibility threshold for single tones. Clearly, the valley-like character of curve 4 in
Figure 4 has disappeared.

Figure 5 shows a second exemplary buried channel formation. Here the
simulated audio spectrum, that may however have a different shape from curve A in Figure
3, causes the masked audio threshold to lie at an approximate level of 40 dB with a shallow
dip of some 10 dB at 13 kHz. For clarity, the spectrum of the audio itself has not been
shown. As should be clear from Figure 3, each separate spectrum peak may raise the masked
threshold over a frequency width in the order of the critical frequency, which generally
causes the smooth appearance of the threshold curve. Again, the influence of the
high-frequency range over 16 kHz has been ignored. Now, curve 1 is the same as in Figure 4.
Curve 2 is again the spectrum of the flat noise, shaped to get approximately the same
appearance as the masked threshold of curve 4; in this case, the overall shape of curve 2 is
much flatter than in the preceding Figure. Curve 3 is identical to curve 2, but relatively
shifted upward over a distance of b*6 dB, wherein in Figure 5, b=6. This implies a buried
channel of six bits wide per sample. It has thus been shown that for higher sound energy the
width of the buried channel may increase. It has further been shown that the shape of the
sound spectrum is crucial in many cases. With the shape of curve 2 in Figure 4 applied in
Figure 5, the improvement would have been much less in the latter Figure. The approach of
the preceding Figures has been based on the full frequency spectrum. In certain audio
systems, the spectrum has been distributed in subbands, wherein the respective subbands
carry largely independent audio signals. In such a situation, the method and apparatus of the
present invention can be applied separately for any subband or subband combination.
At the price of a somewhat higher complexity, this would further raise the
transfer data rate of the buried channel.



WO 95/18523 PCT/IB94/00418

COMPUTATION OF NOISE WEIGHTING FILTERS AND NOISE SHAPING FILTERS 
FROM MASKED TARGET LEVELS 

Hereinafter, a preferred embodiment for calculating a compound filter
curve for filter 64 in Figure 2 is presented. A relevant publication in this field is E.
Ordentlich and Y. Shoham, Low-delay code-excited linear-predictive coding of wideband
speech at 32 kbps, Proc. ICASSP-91, pp. 9-12, Toronto, 1991. By itself, the technique
presented infra is suitable for various applications in different fields, such as MPE, RPE, and
CELP. Therein, an excitation sequence (MPE, RPE) or excitation vector (CELP, codebook
excited linear prediction) is selected on the basis of a weighted mean square error criterion.

In such a coder, short output sequences are generated from a number of
excitation sequences or vectors. The generated output sequences are compared with the
original input sequences. The criterion for comparison is the weighted mean-squared error.
This means that the difference between input and generated output is passed through a noise
weighting filter. The power of the filtered difference sequence is then estimated. This power
is called the weighted mean-squared error. The excitation sequence yielding the minimum
weighted mean-squared error is selected.
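
The selection rule just described can be sketched as follows. This is a simplified illustration, assuming the candidate output sequences have already been generated and that the noise weighting filter is a short FIR filter with zero initial state; the helper names `fir`, `weighted_mse` and `select` and the example taps are hypothetical.

```python
def fir(taps, seq):
    """Filter seq with FIR taps, assuming zero initial state."""
    return [sum(c * seq[n - i] for i, c in enumerate(taps) if n - i >= 0)
            for n in range(len(seq))]


def weighted_mse(target, candidate, taps):
    """Power of the difference sequence after the noise weighting filter."""
    error = [t - c for t, c in zip(target, candidate)]
    return sum(e * e for e in fir(taps, error)) / len(error)


def select(target, candidates, taps):
    """Index of the candidate output with the minimum weighted mean-squared error."""
    return min(range(len(candidates)),
               key=lambda i: weighted_mse(target, candidates[i], taps))


target = [1, 1, 1, 1]
slow_error = [0, 0, 0, 0]   # error is constant (low frequency)
fast_error = [2, 0, 2, 0]   # error alternates in sign (high frequency)
print(select(target, [slow_error, fast_error], [1.0]))       # unweighted: tie, first wins
print(select(target, [slow_error, fast_error], [1.0, 1.0]))  # weighting that attenuates high frequencies
```

Both candidates have the same unweighted error power, so the choice between them is made entirely by the weighting filter, which is the point of the criterion.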

Figure 6 shows how a weighting filter is used in a CELP coder. From the
incoming speech signal x[i] the box LPC analysis computes the prediction coefficients
$a_1, \ldots, a_p$, the coefficients for the weighting filter and a gain factor. The codebook contains a
number of excitation vectors. The vector length is N. During selection all vectors are
multiplied with the gain factor and passed through an analysis filter. This results in a
sequence of N samples denoted by $\hat{x}[i]$. An error sequence is formed by subtracting N input
samples x[i] from N samples $\hat{x}[i]$. The error sequence is passed through the weighting filter.
The weighted mean-squared error, which is the short-term power of the weighted error
sequence, is computed. The selection box selects the code vector that results in the lowest
weighted mean-squared error. Gain factor, prediction coefficients and the index of the
vectors are transmitted to the decoder.

In this context, an excitation vector is considered as an excitation
sequence, therefore only the expression excitation sequence will be used.

The commonly used weighting filters are based on the prediction
coefficients (LPC coefficients) $a_1, \ldots, a_p$ of the speech signal. A possible form of this filter
is



W(z) = [expression illegible in the source] (1)

The coefficients $p_1$ and $p_2$ are found by applying LPC analysis to the first three
autocorrelation lags of the sequence. The coefficients $\delta$, $\gamma_1$ and $\gamma_2$ control the amount of
weighting at the position of the formants. They are tuned such that good perceptual
performance is obtained. Advantageous values are:

$\delta = 0.7$, $\gamma_1 = 0.95$, $\gamma_2 = 0.8$.
Other similar forms are useful as well. For a well-chosen codebook, the power spectral
density function of the coding error is proportional to

$$\left| \frac{1}{W(\exp(j\theta))} \right|^2 \qquad (2)$$



NOISE SHAPERS 

The function of a noise shaper is to give spectral shaping to quantization
noise. Figure 7 shows the basic diagram of a noise shaper. It can be shown that after
decoding the power spectral density function of the quantization noise is given by

$$S(\exp(j\theta)) = |1 + F(\exp(j\theta))|^2 \, \frac{\Delta^2}{12} \qquad (3)$$

where $\Delta$ is again the quantization step size. A commonly used filter F(z) in linear predictive
coding of speech is

F(z) = [expression illegible in the source] (4)

with $\gamma < 1$. In that case the power spectral density function of the quantization noise is
given by



The aim of a weighting filter and of a noise shaper is spectral shaping of the coding error in
such a way that distortion is perceptually least disturbing. In music coding several methods
estimate the masking level of quantization noise in frequency bands. This level is called
masked target level. The goal of these methods is also to obtain a distortion that is
perceptually least disturbing. However, they have a better psychoacoustical foundation than
the weighting filter of (1) or the noise shaper of (4) and will therefore result in a better
approximation of the power spectral density function of masked noise.

The following combines the weighting filter technique or the noise
shaping technique with the computation of masked target level. The weighting filters or noise
shapers that are thus obtained are better than the known ones because their transfer functions
correspond better to the spectral shape of the masked noise.

TARGET LEVELS AND FREQUENCY BANDS 

It is assumed that a set of target levels $t_1, \ldots, t_m$ is computed in advance,
for instance, through measurements discussed with respect to Figures 3, 4, 5. The target
levels represent noise powers in frequency bands at masking threshold. These frequency
bands must be adjacent and cover the range from zero to half the sampling frequency.
Normalized frequencies $\theta$ are used, therefore

$$-\pi \le \theta \le \pi.$$

The corresponding audio frequency f follows from

$$f = \frac{\theta}{2\pi} f_s,$$

where $f_s$ is the sampling rate.

In the following the bandwidths may be chosen arbitrarily. In practice,
critical bands or equally spaced bands will be used. The lower edge, upper edge and centre
frequency of the frequency band corresponding to masked target level $t_k$ are denoted by
$\theta_k^l$, $\theta_k^u$ and $\theta_k^c$ respectively.



RECONSTRUCTION OF POWER SPECTRAL DENSITY 

A smooth power spectral density function (psd) $S(\exp(j\theta))$ can be derived
from the masked target levels by associating with the kth frequency band a
psd-reconstruction function $S_k(\theta)$. The psd then follows from

$$S(\exp(j\theta)) = \sum_{k=1}^{m} t_k S_k(\theta). \qquad (6)$$

There are some constraints for the psd-reconstruction functions. In the first place, the psd
must be non-negative. This implies that

$$S_k(\theta) \ge 0, \quad k = 1, \ldots, m. \qquad (7)$$

In the second place, if power preservation, i.e.

$$\frac{1}{2\pi} \int_{-\pi}^{\pi} S(\exp(j\theta)) \, d\theta = \sum_{k=1}^{m} t_k \qquad (8)$$

is required, then

$$\frac{1}{2\pi} \int_{-\pi}^{\pi} S_k(\theta) \, d\theta = 1, \quad k = 1, \ldots, m. \qquad (9)$$

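A stepwise approximation of the psd is obtained by choosing
psd-reconstruction functions

$$S_k(\theta) = \begin{cases} \pi / \Delta_k, & \theta_k^l \le |\theta| < \theta_k^u, \\ 0, & \text{otherwise}, \end{cases} \qquad (10)$$

where $\Delta_k = \theta_k^u - \theta_k^l$. Stepwise approximations do not result in a smooth psd. For that a
raised-cosine psd-reconstruction function is better suited. This is given by

$$S_k(\theta) = \begin{cases} \dfrac{\pi}{2\Delta_k} \left( 1 + \cos\left( \dfrac{\pi}{\Delta_k} (|\theta| - \theta_k^c) \right) \right), & \theta_k^c - \Delta_k \le |\theta| < \theta_k^c + \Delta_k, \\ 0, & \text{otherwise}. \end{cases} \qquad (11)$$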

Figure 8 shows such an elementary filter curve. Also the raised-cosine psd
reconstruction function has its limitations. It cannot for instance reconstruct a flat psd if the
frequency bands are unevenly spaced. This can be improved by using different upper and
lower slopes. The choice of a psd-reconstruction function is determined by the desired
spectral behaviour of the masked noise.
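
The power-preservation property of the two psd-reconstruction functions can be checked numerically. The sketch below assumes the stepwise function has height pi divided by the band width inside the band, and that the raised-cosine function is centred on the band with support of one width on either side, lying wholly between 0 and pi; the band edges and target levels are made-up example values, and all function names are hypothetical.

```python
import math


def step_psd(theta, lower, upper):
    """Stepwise band function: pi/width for lower <= |theta| < upper, else 0."""
    return math.pi / (upper - lower) if lower <= abs(theta) < upper else 0.0


def raised_cosine_psd(theta, centre, width):
    """Raised-cosine band function on centre - width <= |theta| < centre + width."""
    d = abs(theta) - centre
    if -width <= d < width:
        return math.pi / (2 * width) * (1 + math.cos(math.pi * d / width))
    return 0.0


def mean_power(psd, n=20000):
    """(1 / 2 pi) times the integral of psd over [-pi, pi), by the midpoint rule."""
    h = 2 * math.pi / n
    return sum(psd(-math.pi + (i + 0.5) * h) for i in range(n)) * h / (2 * math.pi)


edges = [0.0, 0.5, 1.5, math.pi]   # example band edges (normalized frequency)
targets = [4.0, 1.0, 0.25]         # example masked target levels


def reconstructed(theta):
    """Sum of target levels times their band functions, stepwise variant."""
    return sum(t * step_psd(theta, lo, hi)
               for t, lo, hi in zip(targets, edges, edges[1:]))


print(mean_power(reconstructed), sum(targets))
print(mean_power(lambda th: raised_cosine_psd(th, 1.0, 0.4)))
```

The first line should show the integrated power close to the sum of the target levels, and the second a value close to one, consistent with the normalization required of each band function.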



APPROXIMATION OF POWER SPECTRAL DENSITY

The reconstructed psd $S(\exp(j\theta))$ is approximated by an all-pole spectrum

$$\hat{S}(\exp(j\theta)) = \frac{Q}{\left| 1 + \sum_{n=1}^{q} b_n \exp(-jn\theta) \right|^2}, \qquad (12)$$

where q is the order of the all-pole spectrum. This results in a weighting filter with transfer
function

$$B(z) = 1 + \sum_{n=1}^{q} b_n z^{-n}. \qquad (13)$$

The weighting filter is an FIR filter, in contrast to the filter of (1). In the following the
$b_1, \ldots, b_q$ are computed from the $t_1, \ldots, t_m$ by minimizing

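$$V = \frac{1}{2\pi} \int_{-\pi}^{\pi} S(\exp(j\theta)) \, |B(\exp(j\theta))|^2 \, d\theta \qquad (14)$$

as a function of $b_1, \ldots, b_q$. In (14), $S(\exp(j\theta))$ follows from (6). By way of summarizing: it
is required to approximate the spectrum $S(\theta)$, wherein $\theta$ is a normalized frequency. Now,
the function B is the inverse function of F (eq. 23) and Q is a constant. Now, computing
derivatives

$$\frac{\partial V(b_1, \ldots, b_q)}{\partial b_n}, \quad n = 1, \ldots, q,$$

and setting them equal to zero leads to the following set of equations

$$\frac{1}{2\pi} \int_{-\pi}^{\pi} S(\exp(j\theta)) \, B(\exp(j\theta)) \exp(jn\theta) \, d\theta = 0, \quad n = 1, \ldots, q. \qquad (15)$$

Or,

$$\sum_{l=1}^{q} b_l \sum_{k=1}^{m} t_k \frac{1}{2\pi} \int_{-\pi}^{\pi} S_k(\theta) \exp(j(n-l)\theta) \, d\theta = -\sum_{k=1}^{m} t_k \frac{1}{2\pi} \int_{-\pi}^{\pi} S_k(\theta) \exp(jn\theta) \, d\theta, \quad n = 1, \ldots, q. \qquad (16)$$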



Define

$$g_{k,n} = \frac{1}{2\pi} \int_{-\pi}^{\pi} S_k(\theta) \exp(jn\theta) \, d\theta, \quad k = 1, \ldots, m, \; n = 1, \ldots, q \qquad (17)$$

and

$$p_n = \sum_{k=1}^{m} t_k \, g_{k,n}. \qquad (18)$$

The $g_{k,n}$ can be computed in advance from the psd-reconstruction functions and stored in an
m x q matrix. On substitution of these results into (16) one obtains the set of equations

$$\sum_{l=1}^{q} b_l \, p_{n-l} = -p_n, \quad n = 1, \ldots, q. \qquad (19)$$

This is a symmetrical, positive-definite Toeplitz system that is identical to the Yule-Walker
equations, known from linear predictive coding. Define the q x q matrix R by

$$R_{n,l} = p_{n-l}, \quad n, l = 1, \ldots, q,$$

and the q vector r by

$$r_n = p_n, \quad n = 1, \ldots, q.$$

This leads to

$$R b = -r, \qquad (20)$$

where the q vector b contains the coefficients $b_1, \ldots, b_q$. The set (19) or (20) is easily solved
by the known Levinson-Durbin algorithm.
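
As a rough illustration of this computation, the sketch below builds the coefficients for stepwise bands, forms the sequence p_n and solves the Toeplitz system with a textbook Levinson-Durbin recursion. It assumes that the stepwise coefficient is sin(n w / 2) / (n w / 2) times cos(n c) for a band of width w centred at c, that p_(-n) equals p_n, and that the lag-zero coefficient equals one by the normalization (9); band edges and target levels are made-up example values and the function names are hypothetical.

```python
import math


def g_step(n, centre, width):
    """Coefficient of a stepwise band at lag n (equals 1 at lag 0)."""
    if n == 0:
        return 1.0
    a = n * width / 2
    return math.sin(a) / a * math.cos(n * centre)


def lag_sequence(targets, bands, q):
    """p_n for n = 0..q as the target-weighted sum of band coefficients."""
    return [sum(t * g_step(n, (lo + hi) / 2, hi - lo)
                for t, (lo, hi) in zip(targets, bands))
            for n in range(q + 1)]


def levinson_durbin(p, q):
    """Solve sum_l b_l p_|n-l| = -p_n for n = 1..q; returns [b_1, ..., b_q]."""
    b = []
    err = p[0]                                   # prediction error power so far
    for n in range(1, q + 1):
        acc = p[n] + sum(b[l] * p[n - 1 - l] for l in range(n - 1))
        k = -acc / err                           # reflection coefficient
        b = [b[l] + k * b[n - 2 - l] for l in range(n - 1)] + [k]
        err *= 1 - k * k
    return b


bands = [(0.0, 0.5), (0.5, 1.5), (1.5, math.pi)]   # example band edges
targets = [4.0, 1.0, 0.25]                         # example masked target levels
p = lag_sequence(targets, bands, 4)
print(levinson_durbin(p, 4))
```

The recursion needs on the order of q squared operations, against q cubed for a general linear solver, which matters when the filter is recomputed for every block of samples.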

EXAMPLES OF $g_{k,n}$

For the stepwise approximation of $S(\exp(j\theta))$ the $g_{k,n}$ are given by

$$g_{k,n} = \frac{\sin(n \Delta_k / 2)}{n \Delta_k / 2} \cos(n \theta_k^c). \qquad (21)$$

For the raised-cosine approximation of $S(\exp(j\theta))$ the $g_{k,n}$ are given by

$$g_{k,n} = \frac{\sin(n \Delta_k)}{n \Delta_k} \cdot \frac{\cos(n \theta_k^c)}{1 - (n \Delta_k / \pi)^2}. \qquad (22)$$

The coefficients $b_1, \ldots, b_q$ can be directly applied in a weighting filter W(z), as shown in
Figure 7. In that case W(z) = B(z), with B(z) defined in (13). In case of a noise shaper, the F(z)
follows from

$$1 + F(z) = \frac{1}{B(z)}, \qquad (23)$$

therefore

$$F(z) = \frac{1}{B(z)} - 1.$$

CLAIMS:



1. A method for encoding a sound representation received as a stream of
multibit input samples, characterized by the steps of:

- constituting a finite length sequence of said input samples and in said sequence determining
an instantaneous non-uniform audibility threshold-versus-frequency characteristic;

- subtracting a shaped noise signal from said input samples to produce processing samples;

- dynamically ignoring processing sample bits below a bit level associated to said
characteristic, through subtracting a dither signal (v) and subsequent quantizing by a variable
number of b bits below said bit level, but retaining at least all processing sample bits above
said bit level;

- replacing such ignored processing sample bits by said dither signal as buried channel bits
featuring an adjustable number of b bits per said processing sample;

- outputting upgraded samples (y) comprising non-ignored processing sample bits and buried
channel bits;

- while producing said noise signal through shape-filtering by a noise shaping filter on a
dither-determined difference between said upgraded samples and said processing samples,
which shape-filtering amends a received difference signal towards said instantaneous
threshold-versus-frequency characteristic.

2. A method as claimed in Claim 1, wherein said noise-shaping filter
receives a difference signal that approaches a white-noise-like characteristic.

3. A method as claimed in Claim 1 or 2, wherein said producing is done by
a noise-shaping filter with a variable characteristic.

4. A method as claimed in Claims 1, 2 or 3, and featuring detecting a
musical transient in a particular time interval, and upon such detecting setting the value of b
in that interval to a value at least co-determined from a neighbouring time interval not having
that musical transient.

5. A method as claimed in any of Claims 1 to 4, and featuring temporary
buffering of data to be used as buried channel data before said replacing in a buffer and
through a time varying rate control time-wise equalizing a filling degree of said buffer.

6. A method as claimed in any of Claims 1 to 5, wherein said buried channel
data is received as an MPEG audio channel.

7. A method as claimed in any of Claims 1 to 6, wherein said shape-filtering
is done with a filter having an overall filter curve compounded from a series of elementary
filter curves each positioned at a respective unique grid frequency and having a width
approximating a local power spectral density of the overall spectrum.

8. A method as claimed in Claim 7, wherein said grid frequencies have
non-uniform spacing.

9. A method as claimed in any of Claims 1 to 8 and applied separately to
respective frequency subbands that are coexistent in a frequency spectrum of said sound
representation.

10. An encoding apparatus for effecting a method as claimed in any of Claims
1 to 9.

11. A decoding apparatus for decoding a signal of upgraded samples produced
through effecting a method as claimed in any of Claims 1 to 9.